Towards Speaker Detection using Lips Movements for Human-Machine Multiparty Dialogue
Authors
Abstract
This paper explores the use of lip movements for speaker and voice activity detection, a task that is essential in multimodal multiparty human-machine dialogue. The task aims at detecting who is speaking, and when, among a set of persons. A multiparty dialogue with 4 speakers is audiovisually recorded and then annotated for speaker and speech/silence segments. Lip movements are tracked using FaceAPI, a commercial real-time face-tracking software package. The paper reports results from 3 classification techniques: neural networks, naïve Bayes classifiers, and Mahalanobis distance. In speech/silence detection, the experiments show promising results using lip movements, with an optimal accuracy of 78.31%. The results also show that the neural network classifier outperforms the other techniques in the speaker-dependent and hybrid settings. In the speaker-independent setting, however, the naïve Bayes classifier performs best, with an accuracy of 64.56%.
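The abstract names Mahalanobis distance as one of the three classifiers. The paper does not publish its feature set or code, so the sketch below is only an illustration of the general technique: fit a mean and covariance per class (speech vs. silence) from lip-movement feature vectors, then assign a new frame to the class with the smaller Mahalanobis distance. The feature values and class names are hypothetical.

```python
import numpy as np

def fit_class(features):
    """Estimate the mean and inverse covariance for one class."""
    mu = features.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(features, rowvar=False))
    return mu, inv_cov

def mahalanobis(x, mu, inv_cov):
    """Mahalanobis distance of x from a class model."""
    d = x - mu
    return float(np.sqrt(d @ inv_cov @ d))

def classify(x, models):
    """Assign x to the class whose model is nearest in Mahalanobis distance."""
    return min(models, key=lambda c: mahalanobis(x, *models[c]))

# Synthetic 2-D lip-movement features (e.g. lip-opening amount and velocity);
# speech frames show larger, more variable motion than silence frames.
rng = np.random.default_rng(0)
speech = rng.normal(1.0, 0.3, size=(200, 2))
silence = rng.normal(0.0, 0.1, size=(200, 2))

models = {"speech": fit_class(speech), "silence": fit_class(silence)}
print(classify(np.array([0.9, 1.1]), models))    # a high-motion frame
print(classify(np.array([0.02, -0.01]), models)) # a near-still frame
```

Using the full covariance (rather than plain Euclidean distance) lets the classifier account for correlated lip-movement features, which is the usual motivation for choosing Mahalanobis distance in this kind of per-class model.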
Similar papers
Towards Speaker Detection using FaceAPI Facial Movements in Human-Machine Multiparty Dialogue
In a multiparty multimodal dialogue setup, where the robot is set to interact with multiple people, a main requirement for the robot is to recognize the user speaking to it. This would allow the robot to pay visual attention to the person it is listening to (for example, directing its gaze and head pose towards the speaker), and to organize the dialogue structure with multiple people. Knowi...
Who’s next? Speaker-selection mechanisms in multiparty dialogue
Participants in conversations have a wide range of verbal and nonverbal expressions at their disposal to signal their intention to occupy the speaker role. This paper addresses two main questions: (1) How do dialogue participants signal their intention to have the next turn, and (2) What aspects of a participant’s behaviour are perceived as signals to determine who should be the next speaker? O...
Visual speech detection using OpenCV
Visual information from the human face, such as lip and tongue movements, provides a great deal of information about the spoken message and helps in understanding verbal communication. Visual speech detection overcomes some of the persistent problems and inaccuracies that creep in when there is background noise. In noisy environments we pay more attention to the lips, which dra...
Automatic Summarization of Open-Domain Multiparty Dialogues in Diverse Genres
Automatic summarization of open-domain spoken dialogues is a relatively new research area. This article introduces the task and the challenges involved and motivates and presents an approach for obtaining automatic-extract summaries for human transcripts of multiparty dialogues of four different genres, without any restriction on domain. We address the following issues, which are intrinsic to s...
The furhat social companion talking head
In this demonstrator we present the Furhat robot head. Furhat is a highly human-like robot head in terms of dynamics, thanks to its use of back-projected facial animation. Furhat also takes advantage of a complex and advanced dialogue toolkit designed to facilitate rich and fluent multimodal multiparty situated spoken human-machine dialogue. The demonstrator will present a social dialogue ...
Publication date: 2012